Random CapsNet forest model for imbalanced malware type classification task

Abstract

The behavior of malware varies depending on the malware type, which affects the strategies of system protection software. Many malware classification models, empowered by machine and/or deep learning, achieve superior accuracies for predicting malware types. Machine learning-based models need heavy feature engineering work, which affects their performance greatly. On the other hand, deep learning-based models require less feature engineering effort than machine learning-based models. However, components of traditional deep learning architectures, such as max and average pooling, cause the architecture to be more complex and the models to be more sensitive to data. Capsule network architectures, on the other hand, reduce the aforementioned complexities by eliminating the pooling components. Additionally, models based on capsule network architectures are less sensitive to data, unlike classical convolutional neural network architectures. This paper proposes an ensemble capsule network model based on the bootstrap aggregating technique. The proposed method is tested on two widely used, highly imbalanced datasets (Malimg and BIG2015), for which the state-of-the-art results are well known and can be used for comparison purposes. The proposed model achieves the highest F-Score, 0.9820, for the BIG2015 dataset and an F-Score of 0.9661 for the Malimg dataset. Our model also reaches the state of the art using 99.7% fewer trainable parameters than the best model in the literature.
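The bootstrap aggregating (bagging) idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner here is a toy nearest-centroid classifier standing in for a capsule network, majority voting is only one common way to combine the members (the abstract does not say which the authors use), and every name below (`bootstrap_sample`, `NearestCentroid`, `fit_bagging`, `predict_bagging`, `macro_f1`) is hypothetical. A macro-averaged F-score helper is included because per-class averaging is a common choice for imbalanced data; whether the paper averages this way is an assumption.

```python
import random
from collections import Counter


def bootstrap_sample(X, y, rng):
    """Draw len(X) training examples with replacement (one bootstrap replicate)."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]


class NearestCentroid:
    """Toy base learner (assumption: stands in for a capsule network)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
        # One mean vector per class that appears in this bootstrap replicate.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def fit_bagging(X, y, n_estimators=5, seed=0):
    """Train each ensemble member on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [NearestCentroid().fit(*bootstrap_sample(X, y, rng))
            for _ in range(n_estimators)]


def predict_bagging(models, x):
    """Combine member predictions by majority vote."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

To try it, call `fit_bagging(X, y)` on a list of feature vectors and labels, then `predict_bagging(models, x)` for a new vector. Note that with a highly imbalanced training set, a bootstrap replicate can miss a rare class entirely, which is one reason aggregating several members helps.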

Similar resources

Random Forest for Malware Classification

The challenge in countering malware activities involves the correct identification and classification of different malware variants. Various malware incorporate code obfuscation methods that alter their code signatures, effectively countering antimalware detection techniques that rely on static methods and signature databases. In this study, we utilized an approach of converting a malware binary int...

Random Forest Classification for Android Malware

Classification techniques such as Support Vector Machines, K-Nearest Neighbours, Decision Trees, Logistic Regression and Naive Bayes have been widely used in the area of intrusion detection research in the security community. They are predominantly used for behaviour-based detection methods (anomaly detection methods). In this paper we exclusively apply the ensemble learning algorithm Random Fo...

Random Forest Based Imbalanced Data Cleaning and Classification

The task given in the PAKDD 2007 data mining competition is a typical problem of learning from an extremely imbalanced data set. In this paper, we propose a combination of random forest based techniques and sampling methods to identify the potential buyers. Our method is mainly composed of two phases: data cleaning and classification, both based on random forest. Firstly, the data set is cleaned by t...

Random Projection Method for Scalable Malware Classification

In this poster, a new approach for scalable behavior-based malware classification is presented. It is based on the random projection method, which is an efficient, effective, yet simple dimensionality reduction method. Interestingly, however, the random projection method has not (to the authors' best knowledge) ever been investigated for its possible usefulness for the malware classification p...

Using Random Forest to Learn Imbalanced Data

In this paper we propose two ways to deal with the imbalanced data classification problem using random forest. One is based on cost-sensitive learning, and the other is based on a sampling technique. Performance metrics such as precision and recall, false positive rate and false negative rate, F-measure and weighted accuracy are computed. Both methods are shown to improve the prediction accurac...

Journal

Journal title: Computers & Security

Year: 2021

ISSN: 0167-4048, 1872-6208

DOI: https://doi.org/10.1016/j.cose.2020.102133